Notes 11/11:
* figure out dcast issues
* replacing NA/0s within pipe
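A minimal base-R sketch of the NA-to-zero step after a long-to-wide cast. The data frame `bites`, its columns, and the helper `na_to_zero()` are all hypothetical; the real cast here uses `dcast`, which is swapped for `stats::reshape()` so the example has no package dependencies. Replacing NA with 0 is only appropriate where a missing cell means "no bites observed", not "not sampled".

```r
# Hypothetical long-format bite counts: one row per fish x substrate combination
bites <- data.frame(
  fish_id   = c("f1", "f1", "f2", "f3"),
  substrate = c("turf", "cca", "turf", "cca"),
  n_bites   = c(12, 3, 7, 5)
)

# Cast long -> wide; combinations that were never observed come out as NA
wide <- reshape(bites, idvar = "fish_id", timevar = "substrate",
                direction = "wide")

# Helper that takes the data frame as its first argument, so it can be used
# as a pipe step, e.g. wide %>% na_to_zero(count_cols)
na_to_zero <- function(df, cols) {
  df[cols][is.na(df[cols])] <- 0
  df
}

# Only touch the count columns, so genuine NAs elsewhere are left alone
count_cols  <- grep("^n_bites\\.", names(wide), value = TRUE)
wide_filled <- na_to_zero(wide, count_cols)
wide_filled
```

Restricting the replacement to named count columns avoids silently zero-filling covariates such as length, where NA really does mean missing.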
Need to determine an appropriate size range for comparison. Because fish sizes are not evenly distributed across islands (i.e. larger fish in Bonaire, smaller in Barbuda), I will likely want to compare length-feeding relationships rather than pooled averages.
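To illustrate why pooled averages can mislead here, the sketch below simulates two islands that share one length-feeding relationship but differ in size distribution. All numbers are invented for illustration; only the variable names `fr`, `length_cm`, and `island` follow the notes.

```r
set.seed(1)
n <- 60

# Hypothetical samples: smaller fish in Barbuda, larger fish in Bonaire
sim <- data.frame(
  island    = rep(c("Barbuda", "Bonaire"), each = n),
  length_cm = c(runif(n, 10, 25), runif(n, 20, 40))
)

# Same underlying length-feeding relationship on both islands (an assumption)
sim$fr <- 1500 - 20 * sim$length_cm + rnorm(2 * n, sd = 100)

# Pooled averages differ, purely because of the size mismatch
pooled_means <- tapply(sim$fr, sim$island, mean)
pooled_diff  <- unname(pooled_means["Bonaire"] - pooled_means["Barbuda"])

# Island effect once length is in the model: difference at a common length
fit <- lm(fr ~ length_cm + island, data = sim)
island_effect <- unname(coef(fit)["islandBonaire"])

# Size range that both islands actually cover, and the sample left inside it
overlap <- c(max(tapply(sim$length_cm, sim$island, min)),
             min(tapply(sim$length_cm, sim$island, max)))
in_window   <- sim$length_cm >= overlap[1] & sim$length_cm <= overlap[2]
n_in_window <- table(sim$island[in_window])

c(pooled_diff = pooled_diff, island_effect = island_effect)
overlap
n_in_window
```

The `overlap` window is one candidate for the comparison size range, and `n_in_window` shows how much sample it costs.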
Potential predictor variables are site-level fish, benthic, and rugosity values. These are likely correlated with one another, and I need to determine which ones I ultimately want to use (if modeling behavioral responses via any multivariate regressions). I can also move to structural equation modeling (SEM) if I want to keep multiple correlated predictors.
First, check distribution of predictor variables of interest: not very normally distributed…
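A small sketch of one way to check normality for a site-level predictor, comparing the raw and log scales. `scar_bm_sim` is a made-up right-skewed variable built from log-normal quantiles so the example is deterministic. With only 13 sites the Shapiro-Wilk test has little power, so a QQ plot is probably at least as informative.

```r
# Hypothetical right-skewed site-level predictor (13 sites)
scar_bm_sim <- exp(qnorm(ppoints(13)) + 5)

# Shapiro-Wilk on the raw and log scales
sw_raw <- shapiro.test(scar_bm_sim)
sw_log <- shapiro.test(log(scar_bm_sim))

c(W_raw = unname(sw_raw$statistic), W_log = unname(sw_log$statistic))

# Visual check (uncomment when running interactively):
# qqnorm(scar_bm_sim); qqline(scar_bm_sim)
```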
Variable selection notes:
- excluding both carnivore variables, as they are highly correlated with scarid biomass and total biomass. Eventually I could make these more nuanced by distinguishing actual predators, but right now I don’t think they reflect actual predator populations of >15cm parrotfish
- rugosity is highly correlated with turf cover and scarid density
- scarid density: removing for now, because I think it was a bit skewed by Barbuda juveniles
- could eventually use conspecific scarid length as another indicator of overfishing?
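The screening described in these notes can be sketched as a correlation matrix plus a list of pairs above a cutoff. The site table, its column names, and the 0.7 cutoff are assumptions for illustration; Spearman is used only because the predictors do not look normally distributed.

```r
set.seed(42)
n_sites <- 13

# Hypothetical site-level predictors; carn_bm is built to track total_bm
total_bm <- runif(n_sites, 50, 500)
site <- data.frame(
  total_bm = total_bm,
  carn_bm  = 0.3 * total_bm + rnorm(n_sites, sd = 5),
  rugosity = runif(n_sites, 1, 2)
)

# Rank-based correlation matrix
r <- cor(site, method = "spearman")

# List each pair once (upper triangle) where |r| exceeds the cutoff
cutoff <- 0.7
idx <- which(abs(r) > cutoff & upper.tri(r), arr.ind = TRUE)
flagged <- data.frame(
  var1 = rownames(r)[idx[, "row"]],
  var2 = colnames(r)[idx[, "col"]],
  r    = r[idx]
)
flagged
```

Each flagged pair is a candidate for dropping one variable or for collapsing both into a PCA axis.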
PCA to visualize variable relationships:
PCA for correlated (benthic only?) variables:
## Importance of components:
##                          PC1    PC2    PC3    PC4     PC5
## Standard deviation     1.754 1.1961 0.5865 0.2898 0.25726
## Proportion of Variance 0.615 0.2862 0.0688 0.0168 0.01324
## Cumulative Proportion  0.615 0.9012 0.9700 0.9868 1.00000
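The proportions in this table follow directly from the standard deviations: each component's variance is its squared standard deviation, divided by the total. The squared values sum to roughly 5, which is what a PCA on five centered and scaled variables would give, so the sketch assumes `prcomp(..., scale. = TRUE)`. The second half uses a made-up matrix with hypothetical column names to show where `sdev` and the site scores come from.

```r
# Standard deviations as reported in the output above
sdev <- c(1.754, 1.1961, 0.5865, 0.2898, 0.25726)

prop_var <- sdev^2 / sum(sdev^2)   # proportion of variance per component
cum_prop <- cumsum(prop_var)       # cumulative proportion

sum(sdev^2)                        # close to 5, the number of scaled variables
round(prop_var, 4)
round(cum_prop, 4)

# Hypothetical 13-site x 5-variable matrix, to show the same quantities
# coming out of prcomp() itself
set.seed(1)
benthic_sim <- matrix(rnorm(13 * 5), ncol = 5,
                      dimnames = list(NULL, paste0("var", 1:5)))
pca <- prcomp(benthic_sim, center = TRUE, scale. = TRUE)

# Site scores on the first two axes, usable as predictors in later models
scores <- pca$x[, 1:2]
head(scores)
```

The first two axes together carry about 90% of the variance, which supports using only those two as predictors.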
Fish-level grazing behaviors (as well as competitive interaction frequency)
Variable selection notes:
- for_bites is correlated with fr and for_dur, but I will play around with keeping it for now.
## Df Sum Sq Mean Sq F value Pr(>F)
## island 2 19329868 9664934 24.48 5.08e-09 ***
## Residuals 80 31586112 394826
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = fr ~ island, data = vet)
##
## $island
## diff lwr upr p adj
## Barbuda-Antigua -121.5556 -670.7081 427.5968 0.8575559
## Bonaire-Antigua 936.8111 485.8992 1387.7231 0.0000115
## Bonaire-Barbuda 1058.3667 630.3282 1486.4053 0.0000002
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 2 0.731 0.4846
## 80
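Each block of output in this section has the same three parts: a one-way ANOVA table, Tukey pairwise comparisons, and Levene's test. The sketch below reproduces that sequence on simulated data, with invented group sizes and means. Levene's test with `center = median` is computed in base R as a one-way ANOVA on absolute deviations from the group medians, which avoids a dependency on `car`. In the table above, the F value is the ratio of the mean squares: 9664934 / 394826 ≈ 24.48.

```r
set.seed(7)

# Hypothetical group sizes and means; only the island names match the notes
n_per <- c(Antigua = 20, Barbuda = 25, Bonaire = 38)
sim <- data.frame(
  island = factor(rep(names(n_per), times = n_per)),
  fr     = rnorm(sum(n_per),
                 mean = rep(c(1000, 900, 1900), times = n_per),
                 sd   = 600)
)

# 1. One-way ANOVA
fit <- aov(fr ~ island, data = sim)
aov_tab <- summary(fit)[[1]]
aov_tab

# 2. Tukey HSD pairwise comparisons (family-wise 95% intervals)
tuk <- TukeyHSD(fit, "island", conf.level = 0.95)
tuk

# 3. Levene's test, median-centered (Brown-Forsythe):
#    ANOVA on absolute deviations from each group's median
sim$abs_dev <- abs(sim$fr - ave(sim$fr, sim$island, FUN = median))
levene <- anova(lm(abs_dev ~ island, data = sim))
levene
```

A non-significant Levene result supports the equal-variance assumption behind both the F test and the Tukey intervals.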
## Df Sum Sq Mean Sq F value Pr(>F)
## island 2 3522230 1761115 26.9 5.52e-10 ***
## Residuals 95 6218718 65460
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = fr ~ island, data = vir, white.adjust = T)
##
## $island
## diff lwr upr p adj
## Barbuda-Antigua 338.84411 180.14219 497.5460 0.0000055
## Bonaire-Antigua 408.89527 267.59360 550.1969 0.0000000
## Bonaire-Barbuda 70.05115 -94.41711 234.5194 0.5698132
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 2 0.9125 0.405
## 95
## Df Sum Sq Mean Sq F value Pr(>F)
## island 2 1.582 0.7911 22.98 1.3e-08 ***
## Residuals 80 2.753 0.0344
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = g_frac ~ island, data = vet, white.adjust = T)
##
## $island
## diff lwr upr p adj
## Barbuda-Antigua -0.02121967 -0.1833515 0.1409122 0.9476109
## Bonaire-Antigua 0.27575852 0.1426312 0.4088858 0.0000122
## Bonaire-Barbuda 0.29697818 0.1706040 0.4233524 0.0000008
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 2 0.1837 0.8325
## 80
## Df Sum Sq Mean Sq F value Pr(>F)
## island 2 2.808 1.4040 26.13 9.1e-10 ***
## Residuals 95 5.105 0.0537
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = g_frac ~ island, data = vir, white.adjust = T)
##
## $island
## diff lwr upr p adj
## Barbuda-Antigua 0.30233003 0.15853810 0.4461220 0.0000076
## Bonaire-Antigua 0.36517814 0.23715172 0.4932046 0.0000000
## Bonaire-Barbuda 0.06284811 -0.08616841 0.2118646 0.5760464
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 2 0.7501 0.4751
## 95
## Df Sum Sq Mean Sq F value Pr(>F)
## island 2 0.314 0.1571 1.448 0.241
## Residuals 80 8.676 0.1085
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = br ~ island, data = vet)
##
## $island
## diff lwr upr p adj
## Barbuda-Antigua -0.008825416 -0.29663426 0.2789834 0.9970480
## Bonaire-Antigua 0.123227748 -0.11309360 0.3595491 0.4303229
## Bonaire-Barbuda 0.132053164 -0.09228034 0.3563867 0.3427971
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 2 10.408 9.601e-05 ***
## 80
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Df Sum Sq Mean Sq F value Pr(>F)
## island 2 0.045 0.02253 0.502 0.607
## Residuals 95 4.267 0.04491
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = br ~ island, data = vir, white.adjust = T)
##
## $island
## diff lwr upr p adj
## Barbuda-Antigua -0.002050384 -0.1335072 0.12940643 0.9992399
## Bonaire-Antigua -0.045754540 -0.1627983 0.07128921 0.6222646
## Bonaire-Barbuda -0.043704156 -0.1799374 0.09252906 0.7260125
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 2 10.63 6.823e-05 ***
## 95
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Df Sum Sq Mean Sq F value Pr(>F)
## island 2 1457 728.5 7.298 0.00126 **
## Residuals 76 7586 99.8
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 4 observations deleted due to missingness
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = for_bites ~ island, data = vet)
##
## $island
## diff lwr upr p adj
## Barbuda-Antigua -0.07948718 -9.447089 9.288114 0.9997732
## Bonaire-Antigua 9.09956631 1.707813 16.491319 0.0118676
## Bonaire-Barbuda 9.17905349 1.787300 16.570807 0.0110379
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 2 2.1906 0.1189
## 76
## Df Sum Sq Mean Sq F value Pr(>F)
## island 2 903.1 451.5 13.19 9.77e-06 ***
## Residuals 88 3012.5 34.2
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 7 observations deleted due to missingness
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = for_bites ~ island, data = vir, white.adjust = T)
##
## $island
## diff lwr upr p adj
## Barbuda-Antigua 3.599465 -0.14464682 7.343576 0.0621841
## Bonaire-Antigua 7.291484 3.90697129 10.675997 0.0000050
## Bonaire-Barbuda 3.692020 -0.09681671 7.480856 0.0578051
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 2 5.0607 0.00831 **
## 88
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
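Levene's test is significant for `br` in both data sets and for `for_bites` in `vir`, so the equal-variance ANOVA is questionable for those responses. As far as I know, `white.adjust` is an argument of `car::Anova()` rather than `aov()`, so it probably has no effect in the calls recorded above. One possible alternative, sketched on simulated data with invented variances, is Welch's one-way test followed by pairwise Welch t-tests.

```r
set.seed(11)

# Hypothetical data with clearly unequal variances among islands
n_per <- c(Antigua = 20, Barbuda = 25, Bonaire = 38)
sim <- data.frame(
  island = factor(rep(names(n_per), times = n_per)),
  br     = rnorm(sum(n_per), mean = 0.3,
                 sd = rep(c(0.05, 0.10, 0.45), times = n_per))
)

# Welch's ANOVA: does not assume equal variances
welch <- oneway.test(br ~ island, data = sim, var.equal = FALSE)
welch

# Pairwise Welch t-tests (no pooled SD) with Holm correction
pw <- pairwise.t.test(sim$br, sim$island,
                      pool.sd = FALSE, p.adjust.method = "holm")
pw
```

The fractional denominator degrees of freedom in the Welch output reflect the variance correction, in contrast to the fixed residual df in the `aov` tables.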
Note: remove G1 grazing instances here?
Notes:
- scarid biomass is not the best predictor once I account for differences between my samples in the sizes of fish I was sampling. I think the grazing/length relationships are much stronger.
- restricting the sample to length windows lowers sample size and makes trends much less pronounced, esp. for phase differences
- reducing the sample to an individual phase only also blurs trends
## Linear mixed-effects model fit by REML
## Data: filter(sum_id_pca1, species_code != "rbp")
## AIC BIC logLik
## 9300.792 9340.572 -4641.396
##
## Random effects:
## Formula: ~1 | site
## (Intercept) Residual
## StdDev: 162.6097 436.1899
##
## Fixed effects: fr ~ phase + length_cm + species + scar_bm + pc1 + pc2
##                             Value Std.Error  DF    t-value p-value
## (Intercept)             1314.6465 135.44162 605   9.706370  0.0000
## phaset                  -157.8237  49.97291 605  -3.158184  0.0017
## length_cm                -10.1823   3.50964 605  -2.901238  0.0039
## speciesSparisoma viride -621.9076  36.49852 605 -17.039257  0.0000
## scar_bm                   -0.0187   0.05582   9  -0.335883  0.7447
## pc1                     -154.4773  46.44156   9  -3.326273  0.0089
## pc2                        0.4426  44.47607   9   0.009952  0.9923
## Correlation:
##                         (Intr) phaset lngth_ spcsSv scr_bm pc1
## phaset                   0.325
## length_cm               -0.560 -0.657
## speciesSparisoma viride -0.202 -0.057  0.068
## scar_bm                 -0.677  0.021 -0.073 -0.001
## pc1                     -0.546  0.007  0.002 -0.042  0.782
## pc2                     -0.090 -0.059  0.068 -0.009  0.072  0.047
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.83450026 -0.56865286 -0.01818742 0.54135788 5.22633799
##
## Number of Observations: 621
## Number of Groups: 13
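The two DF values in the fixed-effects table reflect the level at which each predictor varies. Site-level terms (`scar_bm`, `pc1`, `pc2`) are tested against 13 − (1 + 3) = 9 degrees of freedom, while fish-level terms and the intercept get 621 − (13 + 3) = 605. The sketch below fits the same model structure with `nlme::lme()` on simulated data to show where those numbers come from. All values, and the choice of Scarus vetula as the reference species, are assumptions.

```r
library(nlme)
set.seed(123)

n_sites <- 13
n_obs   <- 621
idx     <- rep_len(seq_len(n_sites), n_obs)  # site index for each fish

# Hypothetical site-level values, constant within a site
site_scar <- runif(n_sites, 100, 3000)
site_pc1  <- rnorm(n_sites)
site_pc2  <- rnorm(n_sites)
site_eff  <- rnorm(n_sites, sd = 160)        # random site intercepts

sim <- data.frame(
  site      = factor(idx),
  scar_bm   = site_scar[idx],
  pc1       = site_pc1[idx],
  pc2       = site_pc2[idx],
  phase     = factor(sample(c("i", "t"), n_obs, replace = TRUE)),
  length_cm = runif(n_obs, 10, 40),
  species   = factor(sample(c("Scarus vetula", "Sparisoma viride"),
                            n_obs, replace = TRUE))
)

# Simulated response, loosely shaped like the fitted model above
sim$fr <- 1300 - 150 * (sim$phase == "t") - 10 * sim$length_cm -
  600 * (sim$species == "Sparisoma viride") - 150 * sim$pc1 +
  site_eff[idx] + rnorm(n_obs, sd = 430)

# Random intercept for site, fitted by REML
fit <- lme(fr ~ phase + length_cm + species + scar_bm + pc1 + pc2,
           random = ~ 1 | site, data = sim, method = "REML")

fit$fixDF$X   # denominator DF per coefficient
```

With only 9 denominator degrees of freedom, the site-level predictors are tested with far less power than the fish-level ones, regardless of how many fish were observed.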
## phase length_cm species scar_bm pc1 pc2
## 1.771002 1.803343 1.010440 2.644180 2.621751 1.011652
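The values above read as variance inflation factors. For a single-coefficient predictor, VIF = 1 / (1 − R²), where R² comes from regressing that predictor on all the others. A VIF of 2.64 for `scar_bm` therefore corresponds to R² ≈ 0.62, consistent with its overlap with `pc1`. The helper `vif_manual()` and the simulated predictors below are hypothetical, and this is not necessarily how the numbers above were computed.

```r
# R-squared implied by the reported VIF for scar_bm
r2_scar_bm <- 1 - 1 / 2.644180
r2_scar_bm

# Hypothetical helper: VIF for each numeric column of X
vif_manual <- function(X) {
  X <- as.data.frame(X)
  sapply(names(X), function(v) {
    others <- setdiff(names(X), v)
    # Regress predictor v on all remaining predictors
    r2 <- summary(lm(reformulate(others, response = v), data = X))$r.squared
    1 / (1 - r2)
  })
}

# Simulated predictors in which scar_bm partly tracks pc1
set.seed(5)
pc1 <- rnorm(13)
X <- data.frame(
  scar_bm = 500 + 300 * pc1 + rnorm(13, sd = 240),
  pc1     = pc1,
  pc2     = rnorm(13)
)

vifs <- vif_manual(X)
vifs
```

The same values can be read off the diagonal of the inverse correlation matrix, `diag(solve(cor(X)))`, which is a quick cross-check.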
To Do as of Nov. 7
* boosted regression trees (Adrian's Ecosphere 2017 paper)
* species as random effect